Syllable based Lattice Transduction for Keyword Search
Abstract
Low-resource speech recognition and keyword search (KWS) are important topics for speech technologies. However, their performance often suffers from out-of-vocabulary (OOV) keywords. Subword units such as syllables are useful in handling this issue. This report introduces a weighted finite state transducer (WFST) based syllable transduction framework for OOV handling in KWS. Syllable lattices are generated by syllable decoding, and OOV keywords are entered into a pronunciation dictionary using a word-to-syllable pronunciation prediction system. The syllable lattices are then transduced into word lattices using both in-vocabulary word pronunciations and OOV pronunciations. Experiments on five languages provided by the IARPA Babel project [1] are presented, and they show that syllable transduction can effectively spot OOV keywords. Combining this approach with two other OOV handling methods further improves keyword search performance.

[1] Supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Defense US Army Research Laboratory contract number W911NF-12-C-0014. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, DoD/ARL, or the U.S. Government.
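To make the transduction step concrete, here is a minimal pure-Python sketch of mapping a syllable lattice to word arcs through a pronunciation dictionary. It is an illustration only, not the WFST implementation the report describes: the arc format `(src, dst, syllable, cost)`, the function name `transduce`, the toy lexicon, and the OOV entry `pabel` (standing in for the output of a pronunciation prediction system) are all assumptions. Costs are treated as negative log probabilities, so they add along a path and the lowest cost is kept.

```python
from collections import defaultdict


def transduce(syllable_arcs, lexicon):
    """Turn syllable arcs (src, dst, syllable, cost) into word arcs
    (src, dst, word, cost) using a word -> [syllable sequences] lexicon."""
    outgoing = defaultdict(list)
    for src, dst, syl, cost in syllable_arcs:
        outgoing[src].append((dst, syl, cost))

    def match(state, syllables):
        # Yield (end_state, total_cost) for each lattice path that starts
        # at `state` and spells exactly `syllables`.
        if not syllables:
            yield state, 0.0
            return
        for dst, syl, cost in outgoing.get(state, ()):
            if syl == syllables[0]:
                for end, rest in match(dst, syllables[1:]):
                    yield end, cost + rest

    best = {}
    for word, pronunciations in lexicon.items():
        for pron in pronunciations:
            for start in list(outgoing):
                for end, cost in match(start, tuple(pron)):
                    key = (start, end, word)
                    # Keep only the cheapest path per word arc.
                    if key not in best or cost < best[key]:
                        best[key] = cost
    return sorted((s, e, w, c) for (s, e, w), c in best.items())


# Toy lattice with two competing first syllables (hypothetical data).
arcs = [(0, 1, "ba", 0.5), (0, 1, "pa", 1.0), (1, 2, "bel", 0.25), (2, 3, "ka", 0.5)]
lexicon = {
    "babel": [("ba", "bel")],   # in-vocabulary word
    "ka": [("ka",)],            # in-vocabulary word
    "pabel": [("pa", "bel")],   # OOV keyword with a predicted pronunciation
}
word_arcs = transduce(arcs, lexicon)
```

In a real system this step would more likely be expressed as WFST composition of the syllable lattice with a syllable-to-word lexicon transducer; the explicit path search above only shows what that composition computes on a small example.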
Related resources
Syllable Based Audio Search Using Confusion Network Arc as Indexing Unit
Compared to English, Chinese has a simpler and more restricted syllabic structure. To exploit this characteristic of Chinese, the syllable is selected as the unit for ASR lattice representation. For fast retrieval, syllable lattices are clustered into confusion network linear lattices and then encoded into an inverted index. To recover the posterior probabilities of prune...
Keyword Spotting by Searching the Syllable Lattices
This paper presents a keyword spotting method based on searching a syllable lattice structure. The Mandarin syllables are represented by initial-final models. Using one-stage dynamic programming, an utterance is converted into a sequence of top-N candidate syllables, which yields a syllable lattice structure for the input utterance. A vocabulary of predefined keywords is represented as a set of sy...
Multi-keyword spotting of telephone speech using a fuzzy search algorithm and keyword-driven two-level CBSM
In telephone speech recognition, the acoustic mismatch between training and testing environments often causes a severe degradation in recognition performance. This paper presents a keyword-driven two-level codebook-based stochastic matching (CBSM) algorithm to eliminate the acoustic mismatch. Additionally, in Mandarin speech, it is difficult to correctly recognize the unvoiced part in a sylla...
Realtime Viterbi Searching for Practical Telephone Speech Recognition Systems
This paper studies the search and pruning process of a telephone speech recognition system for a Private Automatic Branch Exchange (PABX), both to explore the problems encountered in applying speech recognition to the telephone network and to prepare the techniques needed for practical telephone speech recognition systems. Experiment on a baseline system which uses semi-syllable based mult...
An Effective Path-aware Approach for Keyword Search over Data Graphs
Keyword search is known as a user-friendly alternative to structured languages for retrieving information from graph-structured data. Efficient retrieval of relevant answers to a keyword query and effective ranking of these answers according to their relevance are the two main challenges in keyword search over graph-structured data. In this paper, a novel scoring function is proposed, w...